zeroize: zeroize entire capacity of `Vec` #341

aticu · 2020-01-28T08:06:29Z

Re-implement #180, but without the additional required trait bounds that caused #180 to be reverted in #276.

tony-iqlusion · 2020-01-28T13:54:35Z

zeroize/src/lib.rs

+            // - The total size of the slice must be no larger than `isize::MAX`.
+            //       This is true, because `Vec` never allocates more than `isize::MAX` bytes.
+            let extra_capacity = unsafe {
+                core::slice::from_raw_parts_mut(


This appears to create a slice of uninitialized memory, to give it to the Zeroize impl on [Z], which iterates over it and performs ptr::write_volatile.

It might make sense instead to perform ptr::write_volatile here directly, followed by atomic_fence(), to avoid ever constructing an uninitialized slice.

Alternatively this was recently stabilized: rust-lang/rust#68234

I don't quite see how constructing a *mut [T] helps. Its main intent is for data structures storing a raw pointer or a NonNull<T> without lifetime information, but here it would still require somehow manually iterating over the individual element pointers to do a volatile write. In particular, the implementation of Z::zeroize(&mut self) still cannot be called without UB.

tony-iqlusion

I think this looks ok now, but will run it by a few others before merging

aticu · 2020-01-29T16:07:29Z

Thanks! By the way, I just noticed that a similar thing might make sense for String. If you want, I could also implement that.

Oh and I just noticed the CONTRIBUTING.md file. Should I add another commit adding myself to the AUTHORS.md?

tony-iqlusion · 2020-01-29T16:15:52Z

> Should I add another commit adding myself to the AUTHORS.md?

Yes, that'd be good.

> I just noticed that a similar thing might make sense for String. If you want, I could also implement that.

Sure, but I'd prefer a separate PR so it doesn't block this one.

tony-iqlusion · 2020-01-29T16:19:41Z

zeroize/src/lib.rs

+            let extra_capacity_start = unsafe { self.as_mut_ptr().add(self.len()) as *mut u8 };
+            let extra_capacity_len = self.capacity().saturating_sub(self.len());
+
+            for i in 0..(extra_capacity_len * core::mem::size_of::<Z>()) {


I think there's a simpler option than casting to a *mut u8, which is using mem::zeroed to produce the all-zero byte pattern for Z, and ptr::write_volatile to write it, e.g.

    for i in 0..extra_capacity_len {
        unsafe {
            ptr::write_volatile(extra_capacity_start.add(i), mem::zeroed());
        }
    }

I considered that, but it would result in undefined behavior when Z is a type for which the all-zero pattern is not a valid value. However, now that I think about it again, it wouldn't make much sense to implement Zeroize for such a type. I'll adjust that.

Actually, on second thought, if it were implemented that way, it would be possible to invoke undefined behavior from safe code. For example, this would then result in undefined behavior:

    use core::num::NonZeroU8;
    use zeroize::Zeroize;

    struct NonZero(NonZeroU8);

    impl Zeroize for NonZero {
        fn zeroize(&mut self) {
            self.0 = NonZeroU8::new(1).unwrap();
        }
    }

    fn main() {
        let mut vec = vec![NonZero(NonZeroU8::new(2).unwrap())];
        vec.clear();

        // undefined behavior: this would create a `NonZeroU8` with a
        // memory-representation of all-zeroes while zeroing the
        // uninitialized memory
        vec.zeroize();
    }

This may not be a very useful implementation of Zeroize, but it would be a way to invoke undefined behavior in safe code. The other version does not have this problem.

How would the uninitialized memory be exposed to safe code?

Regardless, the net effect is the same: you are writing zeroed bytes to the excess capacity.

> How would the uninitialized memory be exposed to safe code?

It doesn't need to be. According to the doc of mem::zeroed:

There is no guarantee that an all-zero byte-pattern represents a valid value of some type T. For example, the all-zero byte-pattern is not a valid value for reference types (&T and &mut T). Using zeroed on such types causes immediate undefined behavior because the Rust compiler assumes that there always is a valid value in a variable it considers initialized.

Which links to a section in the docs about undefined behavior:

It is the programmer's responsibility when writing unsafe code to ensure that any safe code interacting with the unsafe code cannot trigger these behaviors. unsafe code that satisfies this property for any safe client is called sound; if unsafe code can be misused by safe code to exhibit undefined behavior, it is unsound.

The same page also states that

Rust code is incorrect if it exhibits any of the behaviors in the following list.

[...]

Invalid values for a type with a custom definition of invalid values. In the standard library, this affects NonNull<T> and NonZero*.

With that implementation, the example above would therefore be both unsound and incorrect, even though it consists only of safe code. The values do not need to be accessible by safe code to create the problem here.

With the method which zeroes the memory using u8 this would not be the case, as u8 is guaranteed to be able to hold a value of 0.
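To make the byte-wise approach concrete, here is a minimal sketch of zeroing a `Vec`'s spare capacity through a `*mut u8`. It is not the code merged in this PR: the helper name `zero_spare_capacity` is hypothetical, and `compiler_fence` is only assumed here as a stand-in for the crate's own fence.

```rust
use std::mem;
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical helper (not part of zeroize): overwrite the unused
/// capacity of `v` with zero bytes without ever creating a value of `T`.
fn zero_spare_capacity<T>(v: &mut Vec<T>) {
    let len = v.len();
    let spare_elems = v.capacity().saturating_sub(len);
    // Cannot overflow: a `Vec` never allocates more than `isize::MAX` bytes,
    // and for zero-sized types the product is simply 0.
    let spare_bytes = spare_elems * mem::size_of::<T>();

    // SAFETY: `len <= capacity`, so the offset stays inside the allocation
    // (or one past its end when there is no spare capacity).
    let start = unsafe { v.as_mut_ptr().add(len) as *mut u8 };

    for i in 0..spare_bytes {
        // SAFETY: `start + i` lies within the spare capacity owned by `v`.
        // Every bit pattern is valid for `u8`, so no invalid value is created.
        unsafe { ptr::write_volatile(start.add(i), 0u8) };
    }

    // Assumption: a compiler fence keeps the writes from being reordered
    // past this point by the compiler.
    compiler_fence(Ordering::SeqCst);
}
```

Because only `u8` values are written, the same helper works for element types such as `NonZeroU8` or `NonNull<T>`, for which the all-zero pattern is not a valid value.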

What is a better solution?

The implementation, as currently written in this PR.

It still seems like 6 of one, half dozen of another to me, except looping a byte-at-a-time and doing pointer arithmetic around core::mem::zeroed::<NonNull<T>>() is more complex and potentially slower.

It seems the crux of this is...

Memory behind a raw pointer can hold any value without any problem, because Rust does not consider it to be initialized. The problem only arises when an invalid value of type T exists that Rust considers initialized, which is what mem::zeroed() produces.

...but we're talking about a buffer which is defined as being:

What does a normal Vec::with_capacity call initialize it to?

Nothing, it does not create a value of type T though and just leaves the memory completely uninitialized.

The contract of Vec is always to initialize this capacity in some way before reading from it.

It seems the claim is it might be considered initialized by the Rust compiler when it is uninitialized, but before its API permits any reads, it will be initialized again.

Can either of you give a concrete example of a tractable problem which is possible given the contract of Vec to always (re)initialize this memory which would not occur with pointer-based zeroing?

What you are doing to the excess buffer of the vector is not, and never was, the problem. You are indeed free to do with the extra capacity of bytes whatever you want. But not however you want. The code that writes invalid or uninitialized values into it must still be UB-free, and executing mem::zeroed::<T> is not.

> Can either of you give a concrete example of a tractable problem which is possible given the contract of Vec to always (re)initialize this memory which would not occur with pointer-based zeroing?

The problem is not the Vec or its memory; the problem is simply that the code could call mem::zeroed::<NonNull<()>>(), which is UB. The compiler might make optimizations that would crash the program if such a value ever exists. This would be the same problem in every program, even without a Vec being involved.
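The distinction can be sketched without a `Vec` at all. In the example below, which is only an illustration with hypothetical helper names, a `MaybeUninit<NonZeroU8>` slot is filled with zero bytes through a raw pointer. No value of type `NonZeroU8` is ever produced from those bytes, so nothing invalid exists from the compiler's point of view.

```rust
use std::mem::{self, MaybeUninit};
use std::num::NonZeroU8;
use std::ptr;

/// Hypothetical helper: zero the bytes of a slot without producing a `T`.
fn zero_slot_bytes<T>(slot: &mut MaybeUninit<T>) {
    let p = slot.as_mut_ptr() as *mut u8;
    for i in 0..mem::size_of::<T>() {
        // SAFETY: `p + i` is inside the slot, and any byte is a valid `u8`.
        unsafe { ptr::write_volatile(p.add(i), 0u8) };
    }
}

/// Hypothetical helper: check that every byte of the slot is zero.
/// Only call this after all bytes of the slot have been written.
fn slot_is_all_zero<T>(slot: &MaybeUninit<T>) -> bool {
    let p = slot.as_ptr() as *const u8;
    // SAFETY: the caller has written every byte, so reading them as `u8` is fine.
    (0..mem::size_of::<T>()).all(|i| unsafe { ptr::read(p.add(i)) } == 0)
}

fn demo() -> bool {
    let mut slot = MaybeUninit::new(NonZeroU8::new(2).unwrap());
    zero_slot_bytes(&mut slot);
    // Deliberately NOT done here: `slot.assume_init()` or
    // `mem::zeroed::<NonZeroU8>()`. Either would produce an invalid
    // `NonZeroU8` value, which is undefined behavior.
    slot_is_all_zero(&slot)
}
```

The slot ends up with the same bytes that `mem::zeroed()` would have written. The only difference is that the zero pattern is never treated as an initialized `NonZeroU8`.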

Okay, I think I get it now. Thank you for the explanation.

I, Niclas Schwarzlose, hereby agree to license all contributions I make to this project under the terms of the Apache License, Version 2.0.

aticu requested a review from tony-iqlusion January 28, 2020 08:19

tony-iqlusion reviewed Jan 28, 2020

zeroize: zeroize entire capacity of Vec

14b7c58

aticu requested a review from tony-iqlusion January 29, 2020 10:44

tony-iqlusion approved these changes Jan 29, 2020

tony-iqlusion reviewed Jan 29, 2020

aticu and others added 2 commits January 29, 2020 18:54

AUTHORS.md: adding @aticu and licensing my contributions

c097ed7

I, Niclas Schwarzlose, hereby agree to license all contributions I make to this project under the terms of the Apache License, Version 2.0.

Merge branch 'develop' into develop

78aa2f5

tony-iqlusion merged commit 9a2610e into iqlusioninc:develop Jan 30, 2020

tony-iqlusion mentioned this pull request Sep 15, 2020

zeroize 1.1.1 #507

Merged
